Knowledge-poor and Knowledge-rich Approaches for Multilingual Terminology Extraction

Authors

  • Béatrice Daille
  • Helena Blancafort
Abstract

In this paper, we present two terminology extraction tools in order to compare a knowledge-poor and a knowledge-rich approach. Both tools process single and multi-word terms and are designed to handle multilingualism. We run an evaluation on six languages and two different domains using crawled comparable corpora and hand-crafted reference term lists. We discuss the three main results achieved for terminology extraction. The first two evaluation scenarios concern the knowledge-rich framework. Firstly, we compare performances for each of the languages depending on the ranking that is applied: specificity score vs. the number of occurrences. Secondly, we examine the relevancy of the term variant identification to increase the precision ranking for any of the languages. The third evaluation scenario compares both tools and demonstrates that a probabilistic term extraction approach, developed with minimal effort, achieves satisfactory results when compared to a rule-based method.
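
The contrast between ranking by specificity score and ranking by number of occurrences can be illustrated with a minimal sketch. The abstract does not define the specificity score, so the version below is an assumption: it uses one common formulation, the ratio of a term's relative frequency in the domain corpus to its relative frequency in a general-language corpus, with add-one smoothing. The function name, the smoothing parameter, and the toy counts are all hypothetical and are not taken from the paper.

```python
from collections import Counter


def specificity(term, domain_counts, general_counts, smoothing=1.0):
    """Assumed specificity score: domain relative frequency divided by
    smoothed general-corpus relative frequency (not the paper's formula)."""
    domain_total = sum(domain_counts.values())
    general_total = sum(general_counts.values())
    domain_rel = domain_counts[term] / domain_total
    # Smoothing avoids division by zero for terms unseen in the general corpus.
    general_rel = (general_counts[term] + smoothing) / (general_total + smoothing)
    return domain_rel / general_rel


# Toy, invented counts for illustration only.
domain_counts = Counter(
    {"the": 50, "energy": 10, "wind turbine": 8, "rotor blade": 5}
)
general_counts = Counter({"the": 5000, "energy": 40, "wind turbine": 1})

# Ranking 1: raw number of occurrences in the domain corpus.
by_frequency = sorted(domain_counts, key=domain_counts.get, reverse=True)

# Ranking 2: assumed specificity score.
by_specificity = sorted(
    domain_counts,
    key=lambda t: specificity(t, domain_counts, general_counts),
    reverse=True,
)

print("by frequency:  ", by_frequency)
print("by specificity:", by_specificity)
```

On these toy counts, the frequency ranking is led by a function word, while the specificity ranking moves domain-typical multi-word candidates to the top, which is the kind of difference the first evaluation scenario measures.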

Similar resources

Towards the Automated Enrichment of Multilingual Terminology Databases with Knowledge-Rich Contexts – Experiments with Russian EuroTermBank Data

Although knowledge-rich context (KRC) extraction has received a lot of attention, to our knowledge few attempts at directly feeding KRCs into a terminological resource have been undertaken. The aim of this study, therefore, is to investigate to which extent pattern-based KRC extraction can be useful for the enrichment of terminological resources. The paper describes experiments aiming at the en...

Multilingual Opinion Holder and Target Extraction using Knowledge-Poor Techniques

We describe an approach to multilingual sentiment analysis, in particular opinion holder and opinion target extraction, which requires no annotated data and minimal language-specific input. The approach is based on unsupervised, knowledge-poor techniques which facilitate adaptation to new languages and domains. The system's results are comparable to those of supervised, language-specific system...

Pazienza (University of Roma Tor Vergata, Italy), Armando Stellato (University of Roma Tor Vergata, Italy): Semi-Automatic Ontology Development: Processes and Resources

The collection of the specialized vocabulary of a particular domain (terminology) is an important initial step of creating formalized domain knowledge representations (ontologies). Terminology Extraction (TE) aims at automating this process by collecting the relevant domain vocabulary from existing lexical resources or collections of domain texts. In this chapter, the authors address the extrac...

Mining Multiword Terms from Wikipedia

The collection of the specialized vocabulary of a particular domain (terminology) is an important initial step of creating formalized domain knowledge representations (ontologies). Terminology Extraction (TE) aims at automating this process by collecting the relevant domain vocabulary from existing lexical resources or collections of domain texts. In this chapter, the authors address the extrac...

BabelRelate! A Joint Multilingual Approach to Computing Semantic Relatedness

We present a knowledge-rich approach to computing semantic relatedness which exploits the joint contribution of different languages. Our approach is based on the lexicon and semantic knowledge of a wide-coverage multilingual knowledge base, which is used to compute semantic graphs in a variety of languages. Complementary information from these graphs is then combined to produce a ‘core’ graph w...

Journal:
  • Research in Computing Science

Volume 70

Publication date: 2013